Multiplicative LSTM for sequence modelling
نویسندگان
چکیده
We introduce multiplicative LSTM (mLSTM), a novel recurrent neural network architecture for sequence modelling that combines the long short-term memory (LSTM) and multiplicative recurrent neural network architectures. mLSTM is characterised by its ability to have different recurrent transition functions for each possible input, which we argue makes it more expressive for autoregressive density estimation. We demonstrate empirically that mLSTM outperforms standard LSTM and its deep variants for a range of character level modelling tasks, and that this improvement increases with the complexity of the task. This model achieves a test error of 1.19 bits/character on the last 4 million characters of the Hutter prize dataset when combined with dynamic evaluation.
منابع مشابه
Optimizing and Contrasting Recurrent Neural Network Architectures
Recurrent Neural Networks (RNNs) have long been recognized for their potential to model complex time series. However, it remains to be determined what optimization techniques and recurrent architectures can be used to best realize this potential. The experiments presented take a deep look into Hessian free optimization, a powerful second order optimization method that has shown promising result...
متن کاملUnderstanding Gating Operations in Recurrent Neural Networks through Opinion Expression Extraction
Extracting opinion expressions from text is an essential task of sentiment analysis, which is usually treated as one of the word-level sequence labeling problems. In such problems, compositional models with multiplicative gating operations provide efficient ways to encode the contexts, as well as to choose critical information. Thus, in this paper, we adopt Long Short-Term Memory (LSTM) recurre...
متن کاملLong Short-Term Memory
Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, ...
متن کاملRESOLUTION METHOD FOR MIXED INTEGER LINEAR MULTIPLICATIVE-LINEAR BILEVEL PROBLEMS BASED ON DECOMPOSITION TECHNIQUE
In this paper, we propose an algorithm base on decomposition technique for solvingthe mixed integer linear multiplicative-linear bilevel problems. In actuality, this al-gorithm is an application of the algorithm given by G. K. Saharidis et al for casethat the rst level objective function is linear multiplicative. We use properties ofquasi-concave of bilevel programming problems and decompose th...
متن کاملA Study on LSTM Networks for Polyphonic Music Sequence Modelling
Neural networks, and especially long short-term memory networks (LSTM), have become increasingly popular for sequence modelling, be it in text, speech, or music. In this paper, we investigate the predictive power of simple LSTM networks for polyphonic MIDI sequences, using an empirical approach. Such systems can then be used as a music language model which, combined with an acoustic model, can ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1609.07959 شماره
صفحات -
تاریخ انتشار 2016